🤖 feat: add deterministic stream guardrails for verification and doom loops #2476
ibetitsmike wants to merge 8 commits into main
|
@codex review |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 29df8afc44
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex review Addressed feedback: tightened |
Reviewed commit: e99336f295
|
@codex review Addressed: new-file creation in |
Reviewed commit: a503345873
|
@codex review Addressed: validation commands are now recognized after shell operators ( |
Reviewed commit: ece687522d
|
@codex review Addressed: removed |
Reviewed commit: e27c3a7853
|
@codex review Addressed with a clarifying code comment. Shell command parsing with regex is inherently imprecise — environment prefixes ( |
Reviewed commit: 16a071c69a
|
Pausing the review loop here and leaving this for human direction.
Re: bash-based edit tracking: This was deliberately scoped out in the original plan. Tracking file edits via bash would require parsing arbitrary shell commands to figure out which files they write to (redirections, …). The alternative (heuristically guessing which files bash commands modify) would introduce a high false-positive rate and make the guardrail unreliable in the other direction. |
…mand run_and_report is a generic wrapper, not inherently a validation command. Only match when the wrapped command itself is a validation command (e.g., run_and_report typecheck make typecheck).
The create-file branch in file_edit_insert.ts bypassed executeFileEditOperation, so new files weren't counted by the edit tracker. This meant a stream that only created files could skip verification.
Match validation commands like 'cd packages/app && make test' and 'source .env; bun test' by accepting shell operators as command prefixes in addition to line start.
Remove .*? from the run_and_report pattern so that only the third word (the actual command) is checked against validation patterns. Chained commands after && are already caught by the standalone pattern.
Shell command parsing with regex is inherently imperfect. The escape hatch (second agent_report always passes) covers false negatives from env prefixes and shell wrappers.
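The matching rules described in these commit messages could be sketched roughly as below. This is a minimal illustration, not the PR's actual code: the command list, the operator set, and the names `VALIDATION_COMMANDS` and `isValidationCommand` are assumptions made for the example.

```typescript
// Hypothetical command list; the PR's real patterns are not shown in this thread.
const VALIDATION_COMMANDS = ["make test", "make typecheck", "make lint", "bun test", "vitest", "tsc"];

// Escape regex metacharacters so each command is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const alternatives = VALIDATION_COMMANDS.map(escapeRegExp).join("|");

// A command may start a line or follow a shell operator (&&, ||, ;, |).
const PREFIX = "(?:^|&&|\\|\\||;|\\|)\\s*";
// The command must end at whitespace, an operator, or end of line.
const SUFFIX = "(?=\\s|;|&|\\||$)";

// Standalone form, e.g. "cd packages/app && make test".
const STANDALONE = new RegExp(`${PREFIX}(?:${alternatives})${SUFFIX}`, "m");

// Wrapped form: run_and_report <name> <command>. Only the third word onward
// is checked, so a generic wrapper around a non-validation command is ignored.
const WRAPPED = new RegExp(`${PREFIX}run_and_report\\s+\\S+\\s+(?:${alternatives})${SUFFIX}`, "m");

function isValidationCommand(script: string): boolean {
  return STANDALONE.test(script) || WRAPPED.test(script);
}
```

Under these assumptions `FOO=1 make test` is not recognized, which is the kind of false negative the second-attempt escape hatch is meant to absorb.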
16a071c to b5757d4
|
@codex review |
Reviewed commit: b5757d4d7e
|
@codex review |
Reviewed commit: b5757d4d7e
|
@codex review |
Reviewed commit: 95d45718c2
Background bash commands (run_in_background=true) haven't produced results yet when they start, so they shouldn't count as 'validation attempted' for the pre-completion verification guard.
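A small sketch of that rule follows. Only `run_in_background` comes from the commit message; the `BashCallSketch` shape, the `script` field, and the function name are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape of a bash tool call; only run_in_background is named in the thread.
interface BashCallSketch {
  script: string;
  run_in_background?: boolean;
}

// A background command has not produced a result when it starts,
// so it is never treated as a validation attempt.
function countsAsValidationAttempt(
  call: BashCallSketch,
  looksLikeValidation: (script: string) => boolean,
): boolean {
  if (call.run_in_background === true) {
    return false;
  }
  return looksLikeValidation(call.script);
}
```

The matcher is passed in as a parameter so the sketch stays independent of whatever pattern list the real guard uses.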
95d4571 to 07e9e67
|
@codex review |
Reviewed commit: 07e9e67fcd
…n't count Previously, running make test before editing files would permanently satisfy the verification guard. Now file edits reset the validation tracker, ensuring only post-edit validation counts.
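One way to picture the reset behaviour, together with the nudge-then-allow escape hatch mentioned elsewhere in this thread, is the sketch below. The class name, method names, and fields are assumptions for illustration and are not taken from the PR's `StreamVerificationTracker`.

```typescript
// Illustrative stand-in for a per-stream verification tracker (names are hypothetical).
class VerificationTrackerSketch {
  private editsSeen = false;
  private validatedSinceLastEdit = false;
  private nudged = false;

  // Any file edit invalidates earlier validation runs.
  recordEdit(): void {
    this.editsSeen = true;
    this.validatedSinceLastEdit = false;
  }

  // Called when a validation-like command has been run.
  recordValidation(): void {
    this.validatedSinceLastEdit = true;
  }

  // Returns true when the completion report should be rejected with a nudge.
  shouldRejectReport(): boolean {
    if (!this.editsSeen || this.validatedSinceLastEdit) {
      return false;
    }
    if (this.nudged) {
      return false; // escape hatch: the second attempt always passes
    }
    this.nudged = true;
    return true;
  }
}
```

With this shape, a `make test` run before the first edit no longer satisfies the guard, while an unverified second report still goes through.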
|
@codex review |
Reviewed commit: f8b64a38ff
|
@codex review |
Reviewed commit: f8b64a38ff
Note for reviewers
Codex keeps raising the same P1/P2 about tracking bash-based file edits (redirections, …). This is an intentional design decision:
All other Codex feedback was addressed with code changes:
CI checks are all passing. Ready for human review. |
|
@codex review |
Reviewed commit: f8b64a38ff
Summary
Adds two deterministic harness guardrails to the agent loop that enforce better agent behavior via the tool pipeline (not just prompting):
- … agent_report to reject completion when the agent edited files but never ran any validation commands (tests, typecheck, lint). Allows through on a second attempt as an escape hatch.
Implementation
New per-stream tracker classes
- StreamEditTracker: counts edits per file path, supports one-time nudge per file per stream
- StreamVerificationTracker: tracks whether any validation-like bash commands were run, with a one-time nudge-then-allow-through lifecycle
Both are instantiated per-stream in aiService.ts and threaded through ToolConfiguration to tool factories.
Verification guard (agent_report)
- … { success: true }, checks if edits occurred and no validation was attempted
- … make test, bun test, vitest, tsc, run_and_report, etc.)
Doom-loop nudge (file_edit_operation)
- … <notification> via __mux_notifications (model-only, stripped before UI/persistence)
Safety
- … ToolConfiguration; IPC tool calls without trackers see zero change
- … __mux_notifications infrastructure (already tested for stripping before persistence/UI)
Validation
- … StreamEditTracker, StreamVerificationTracker, agent_report, bash, file_edit_operation)
- make typecheck ✅
- make lint ✅
- make fmt-check ✅
Generated with mux • Model: anthropic:claude-opus-4-6 • Thinking: xhigh • Cost: $1.44
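The per-file doom-loop nudge described in the summary could look something like the sketch below. The summary does not state the edit-count threshold, so it is a constructor parameter here. The class and method names are hypothetical and are not the PR's `StreamEditTracker` API.

```typescript
// Illustrative per-stream edit counter (names and threshold are assumptions).
class EditTrackerSketch {
  private readonly counts = new Map<string, number>();
  private readonly nudged = new Set<string>();
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Records one edit and returns true only the first time a file
  // reaches the threshold, so each file is nudged at most once per stream.
  recordEdit(path: string): boolean {
    const count = (this.counts.get(path) ?? 0) + 1;
    this.counts.set(path, count);
    if (count >= this.threshold && !this.nudged.has(path)) {
      this.nudged.add(path);
      return true;
    }
    return false;
  }
}
```

Creating a fresh instance per stream gives the one-time-per-stream behaviour without any explicit reset logic.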